# Code for "Measuring Under- and Overreaction in Expectation Formation"

Python and STATA code for "Measuring Under- and Overreaction in Expectation Formation" by Simas Kucinskas and Florian Peters.

## Replication instructions

Results in the paper can be replicated by running

> ./run_all

on a command line interface (e.g. the Terminal app on macOS).
Provided that Python, STATA, and LaTeX are installed along with the relevant packages,
the system requirements are met, and the system paths to Python and STATA are set,
this bash script should work out of the box on Linux and macOS machines. The bash scripts
will *not* run on Windows. On Windows, the scripts should still be helpful for understanding
the structure of the code and the sequence of the analysis. The Python and STATA code itself
works across platforms, including on Windows.

For help on setting system paths to STATA:
https://www.stata.com/support/faqs/mac/advanced-topics/#:~:text=Any%20do%2Dfile%20that%20you,the%20do%2Dfile%20is%20running.

For help on setting system paths in general:
https://wpbeaches.com/how-to-add-to-the-shell-path-in-macos-using-terminal/

The command line call for STATA is version dependent. The bash script is written for STATA MP. If you have STATA SE or STATA BE,
please update lines 54-59 in the script "run_empirics".
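
As an illustration of why the call is version dependent, here is a minimal Python sketch that builds a batch-mode STATA command for a given edition. The executable names (`stata-mp`, `stata-se`, `stata`) are the conventional console names on Linux and are an assumption here, as is the helper name `stata_batch_command`; this is not the code in "run_empirics", so check the names used by your own installation.

```python
import shutil

# Assumed console executable names per edition; verify against your installation.
STATA_EXECUTABLES = {"MP": "stata-mp", "SE": "stata-se", "BE": "stata"}


def stata_batch_command(edition, do_file, which=shutil.which):
    """Return the argument list that runs `do_file` in batch mode (hypothetical helper)."""
    exe = STATA_EXECUTABLES[edition.upper()]
    # `which` returns None when the executable is not on the system path.
    if which(exe) is None:
        raise FileNotFoundError(f"'{exe}' was not found on the system path")
    # `-b do <file>` asks STATA to run the do-file in batch mode.
    return [exe, "-b", "do", do_file]
```

The returned list can be passed to `subprocess.run`. The `which` argument exists only so the lookup can be replaced in a test.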

 
## System requirements

Replication files require Python, STATA, and LaTeX to be installed, along with the following packages:

- For Python: pandas, numpy, statsmodels
- For STATA: reghdfe, ivreghdfe, and winsor2
- For LaTeX: XITS Math (fonts)

We recommend using the Anaconda distribution to get these Python packages and using 

> ssc install <package_name>

to install the necessary STATA packages.
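
Before starting a long run, it can help to confirm that the Python packages listed above are importable. The following is a minimal sketch using only the standard library; the function name `missing_packages` is hypothetical and the check is not part of the replication scripts.

```python
import importlib.util

# Python packages named in the system requirements.
REQUIRED_PACKAGES = ("pandas", "numpy", "statsmodels")


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the package names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing Python packages:", ", ".join(missing))
    else:
        print("All required Python packages are installed.")
```

Run it with the same Python interpreter that is on the system path, since that is the one the bash scripts will call.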


## Sequence of analysis

Python is used to construct the datasets used in the empirical analysis.
The empirical analysis is then carried out in STATA. Finally, graphs and
tables are constructed in Python. Python is also used for some additional,
more numerically intensive calculations that use the estimates obtained
with STATA, including the calibration exercise.


## Expected runtime

Running the empirical analysis via run_all takes around 4 hours
on an Ubuntu laptop with a 2.3GHz CPU (4 cores) and 8GB RAM.


## Sub-folder structure

The code that performs the analysis is located in ./measuring_expectations/.

The ./fixed_revisions/ folder holds fixed revisions of the data. This way,
parts of the analysis can be completed without running all of the code.
For example, to run the analysis of SPF data in STATA, one can set the directory 
to ./measuring_expectations/code/stata/ by typing

> cd ./measuring_expectations/code/stata/

in the command line and then type in

> do run_all.do

in STATA to get the results. Provided that the relevant
input data is located in the ./fixed_revisions/ folder, the code will run
even if the user does not have Python installed. (Python is used to
construct consensus forecasts.)

## Unit tests

Python unit tests are in ./measuring_expectations/python/tests. They use
nosetests, which must be installed for the tests to run. The tests are
run as part of the ./run_all bash script.
